An STD System Using Multiple STD Results and Multiple Rescoring Method for NTCIR-12 SpokenQuery&Doc Task

Authors

  • Ryota Kon'no
  • Kazuki Ouchi
  • Masato Obara
  • Yoshino Shimizu
  • Takashi Chiba
  • Tatsuro Hirota
  • Yoshiaki Itoh
Abstract

Research on Spoken Term Detection (STD) has been actively conducted in recent years. The task of STD is to search for a particular speech segment in a large amount of multimedia data that includes audio or speech data. In NTCIR-12, a task containing multiple spoken queries was newly added to the STD task. In this paper, we explain the STD system that our team developed for the NTCIR-12 SpokenQuery&Doc task. We have already proposed various methods to improve STD accuracy for out-of-vocabulary (OOV) query terms. Our method consists of four steps. First, multiple automatic speech recognizers (ASRs) are run on the spoken documents using triphones, syllables, demiphones, and SPS as subword units, yielding multiple speech recognition results; retrieval results are obtained for each subword unit. Second, these retrieval results are integrated [1][2]. Third, we apply a rescoring method to improve STD accuracy for the highly ranked candidates [3]. Lastly, a rescoring method is applied to compare a query with the spoken documents in more detail by using the posterior probabilities obtained from a Deep Neural Network (DNN) [4]. We apply this method only to the top candidates to reduce the retrieval time [5]. For a spoken query, we use two rescoring methods. The first method compares the posterior probability vectors of the spoken query and the spoken documents. The second method utilizes the papers in proceedings. We apply these methods to the NTCIR-12 test collection and show experimental results.
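
The integrate-then-rescore flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the abstract does not say how retrieval results are integrated (a weighted average of distances is used below), nor how posterior vectors are compared (a DTW with a negative-log inner-product local distance, a common choice for posteriorgram matching, is swapped in). The names `integrate_scores`, `dtw_distance`, and `rescore_top` are hypothetical.

```python
import math


def integrate_scores(results_per_unit, weights=None):
    """Fuse per-subword-unit distances for each candidate (lower is better).

    results_per_unit: dict unit_name -> dict candidate_id -> distance.
    ASSUMPTION: a weighted average over the units that returned the candidate.
    """
    if weights is None:
        weights = {unit: 1.0 for unit in results_per_unit}
    candidates = set()
    for scores in results_per_unit.values():
        candidates.update(scores)
    fused = {}
    for cand in candidates:
        total, weight_sum = 0.0, 0.0
        for unit, scores in results_per_unit.items():
            if cand in scores:
                total += weights[unit] * scores[cand]
                weight_sum += weights[unit]
        fused[cand] = total / weight_sum
    return fused


def frame_distance(p, q, eps=1e-10):
    """ASSUMED local distance: -log of the inner product of two posterior vectors."""
    return -math.log(max(sum(a * b for a, b in zip(p, q)), eps))


def dtw_distance(query, segment):
    """Length-normalized DTW distance between two posterior-vector sequences."""
    n, m = len(query), len(segment)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], segment[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def rescore_top(fused, query_post, segment_posts, top_n=2):
    """Re-rank only the top_n fused candidates by DTW; keep the rest in fused order."""
    ranked = sorted(fused, key=fused.get)
    head = sorted(ranked[:top_n],
                  key=lambda c: dtw_distance(query_post, segment_posts[c]))
    return head + ranked[top_n:]
```

Restricting the DTW pass to `top_n` candidates mirrors the abstract's point that the detailed comparison is applied only to the top candidates to keep retrieval time down; the value of `top_n` is a free parameter in this sketch.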

Similar resources

Evaluation of DNN-based Phoneme Estimation Approach on the NTCIR-12 SpokenQuery&Doc-2 SQ-STD Subtask

This paper proposes a correct phoneme sequence estimation method using a deep neural network (DNN)-based framework for spoken term detection (STD). We use a DNN architecture as a correct phoneme estimator. The DNN-based estimator estimates a correct phoneme sequence of an utterance from some sorts of phoneme-based transcriptions produced by multiple ASR systems in post-processing, for reducing ...

Spoken Document Retrieval Experiments for SpokenQuery&Doc at Ryukoku University (RYSDT)

In this paper, we describe the spoken document retrieval (SDR) systems of Ryukoku University, which participated in the NTCIR-11 “SpokenQuery&Doc” task. The NTCIR-11 SpokenQuery&Doc task has two subtasks: the “spoken content retrieval (SCR) subtask” and the “spoken term detection (STD) subtask”. We participated in the SCR and STD subtasks as team RYSDT. In this paper, our SDR and STD systems are described.

STD Score Combination with Acoustic Likelihood and Robust SCR Models for False Positives: Experiments at NTCIR-11 SpokenQuery&Doc

In this paper, we report our experiments at the NTCIR-11 SpokenQuery&Doc task [1]. We participated in both the STD and SCR subtasks of SpokenDoc. For the STD subtask, we try to improve detection accuracy by combining the DTW distance between syllable sequences and the acoustic likelihood of the detected speech segment. The final combined score, which is obtained by applying logistic regression on the, was...

Graph-based Document Expansion and Robust SCR Models for False Positives: Experiments at the NTCIR-12 SpokenQuery&Doc-2

In this paper, we report our experiments at the NTCIR-12 Spoken Query&Doc-2 task. We participated in the spoken query driven spoken content retrieval (SQ-SCR) subtask of Spoken Query&Doc2. We submitted two types of results: a conventional spoken content retrieval method (referred to as C-SCR) and an STD-based approach for SCR (referred to as STD-SCR). The latter was proposed in order to deal with spe...

Overview of the NTCIR-12 SpokenQuery&Doc-2 Task

This paper presents an overview of the Spoken Query and Spoken Document retrieval (SpokenQuery&Doc-2) task at the NTCIR-12 Workshop. This task included spoken query driven spoken content retrieval (SQ-SCR) and a spoken query driven spoken term detection (SQ-STD) as the two subtasks. The paper describes details of each sub-task, the data used, the creation of the speech recognition systems used ...

Publication date: 2016